{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "0f973def-80e9-40e5-91b1-86055b9f9646",
   "metadata": {},
   "source": [
    "# 希尔排序"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "afd9ba9e-1339-41a4-a7f2-0db13c4974d6",
   "metadata": {},
   "source": [
    "希尔排序也称“递减增量排序”，它对插入排序做了改进，将列表分成数个子列表，并对每一个子列表应用插入排序。如何切分列表是希尔排序的关键——并不是连续切分，而是使用增量 i（有时称作步长）选取所有间隔为 i 的元素组成子列表。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "667bd698-021a-4955-a42d-01ae0f2cef97",
   "metadata": {},
   "source": [
    "以下图中的列表为例，这个列表有 9 个元素。如果增量为 3，就有 3 个子列表，每个都可以应用插入排序，结果如后面的图所示。尽管列表仍然不算完全有序，但通过给子列表排序，我们已经让元素离它们的最终位置更近了。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9af36a50-002f-4d7b-87a6-6e9a9adb7b2d",
   "metadata": {},
   "source": [
    "![希尔排序1](shell_sort1.jpg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d005e3ff-e627-4b14-9cc6-29e96012cc32",
   "metadata": {},
   "source": [
    "![希尔排序2](shell_sort2.jpg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5dba9837-ab17-42fd-9cfc-9e66fb40a687",
   "metadata": {},
   "source": [
    "下展示了最终的插入排序过程。由于有了之前的子列表排序，因此总移动次数已经减少了。本例只需要再移动 4 次。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a6ef5ebe-30a7-4a8a-92de-906b78ba6861",
   "metadata": {},
   "source": [
    "![希尔排序3](shell_sort3.jpg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9149bc05-5614-42de-9c2b-bd2cc77f680a",
   "metadata": {},
   "source": [
    "如前所述，如何切分列表是希尔排序的关键。后面的代码实现中的函数采用了另一组增量。先为 $\\frac{n}{2}$个子列表排序，接着是 $\\frac{n}{4}$ 个子列表。最终，整个列表由基本的插入排序算法排好序。下图展示了采用这种增量后的第一批子列表。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8d9cd5c4-37bd-473e-bc6a-919c5804cc96",
   "metadata": {},
   "source": [
    "![希尔排序4](shell_sort4.jpg)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "35b0d13e-1e61-4517-8228-03cfcdb1b41a",
   "metadata": {},
   "outputs": [],
   "source": [
    "def shell_sort(alist):\n",
    "    sublistcount = len(alist) // 2\n",
    "    while sublistcount > 0:\n",
    "        for startposition in range(sublistcount):\n",
    "            gap_insertion_sort(alist, startposition, sublistcount)\n",
    "\n",
    "        sublistcount = sublistcount // 2\n",
    "\n",
    "def gap_insertion_sort(alist, start, gap): \n",
    "    for i in range(start+gap, len(alist), gap):\n",
    "        currentvalue = alist[i]\n",
    "        position = i\n",
    "\n",
    "        while position >= gap and alist[position-gap] > currentvalue:\n",
    "            alist[position] = alist[position-gap]\n",
    "            position = position-gap\n",
    "\n",
    "        alist[position] = currentvalue"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "ccc7d68a-48c1-4f87-baba-c901458433fd",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "before sorting: [54, 26, 93, 17, 77, 31, 44, 55, 20]\n",
      "after sorting: [17, 20, 26, 31, 44, 54, 55, 77, 93]\n"
     ]
    }
   ],
   "source": [
    "a_list = [54, 26, 93, 17, 77, 31, 44, 55, 20]\n",
    "print(f\"before sorting: {a_list}\")\n",
    "shell_sort(a_list)\n",
    "print(f\"after sorting: {a_list}\")\n",
    "assert a_list == sorted(a_list)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "6a2f4ff5-9010-4c95-9cb5-002ce5a5b81d",
   "metadata": {},
   "outputs": [],
   "source": [
    "from random import randint, seed\n",
    "seed(13)\n",
    "lst_to_sort = [randint(100, 999) for _ in range(1000)]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "8ae7563b-7c9f-4958-b099-a7327a18f39d",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "634 μs ± 5.23 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)\n"
     ]
    }
   ],
   "source": [
    "%timeit shell_sort(lst_to_sort)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "37c36da7-b385-4c00-b02d-05f2cc645d8d",
   "metadata": {},
   "source": [
    "希尔排序的时间复杂度大概介于 $O(n)$ 和 $O(n^2)$ 之间。若采用上面代码中的增量，则时间复杂度是 $O(n^2)$ 。通过改变增量，比如采用 $2^k-1 (1, 3, 7, 15, 31, \\cdots)$，希尔排序的时间复杂度可以达到 $O(n^{\\frac{3}{2}})$。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e6ebc630-3cb6-42bd-af9b-697b15530f41",
   "metadata": {},
   "source": [
    "### 参考文档\n",
    "《Python数据结构与算法分析（第2版）》：5.3.4 希尔排序"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
